
Deep Interactive Region Segmentation and Captioning



Abstract

With recent innovations in dense image captioning, it is now possible to describe every object of the scene with a caption while objects are determined by bounding boxes. However, interpretation of such an output is not trivial due to the existence of many overlapping bounding boxes. Furthermore, in current captioning frameworks, the user is not able to apply personal preferences to exclude areas that are not of interest. In this paper, we propose a novel hybrid deep learning architecture for interactive region segmentation and captioning where the user is able to specify an arbitrary region of the image that should be processed. To this end, a dedicated Fully Convolutional Network (FCN) named Lyncean FCN (LFCN) is trained using our special training data to isolate the User Intention Region (UIR) as the output of an efficient segmentation. In parallel, a dense image captioning model is utilized to provide a wide variety of captions for that region. Then, the UIR is explained with the caption of the best-match bounding box. To the best of our knowledge, this is the first work that provides such a comprehensive output. Our experiments show the superiority of the proposed approach over state-of-the-art interactive segmentation methods on several well-known datasets. In addition, replacement of the bounding boxes with the result of the interactive segmentation leads to a better understanding of the dense image captioning output as well as accuracy enhancement for object detection in terms of Intersection over Union (IoU).
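The abstract's matching step — explaining the UIR with the caption of the best-match bounding box, evaluated via Intersection over Union — can be illustrated with a minimal sketch. The paper does not specify the exact matching criterion, so the function names (`iou`, `best_caption`) and the assumption that the best match is the candidate box with highest IoU against the UIR's bounding box are illustrative, not the authors' implementation:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_caption(uir_box, candidates):
    """Pick the dense-captioning candidate whose box best overlaps the UIR box.

    candidates: list of dicts like {"box": (x1, y1, x2, y2), "caption": str}.
    """
    return max(candidates, key=lambda c: iou(uir_box, c["box"]))["caption"]
```

For example, given a UIR box of `(0, 0, 10, 10)` and two candidates with boxes `(0, 0, 4, 4)` and `(1, 1, 9, 9)`, the second candidate wins (IoU 0.64 vs 0.16) and its caption is returned.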
